Abstract

The aim of this manuscript is to characterize monotone metrics on the space of Markov maps. Here, a
metric is not necessarily Riemannian, i.e., it need not be given by the inner product of a vector with itself.

So far, there is an extensive literature on metrics on the space of probability distributions
and quantum states. Among these works, Cencov and Petz characterized all the monotone metrics on the
classical and quantum state spaces. As for channels, however, little is known about their geometrical
structure.

In the author's previous manuscript, upper and lower bounds on monotone channel metrics
were derived using resource conversion theory, and it was proved that no monotone metric can be
Riemannian.

Due to the latter result, we cannot rely on Cencov's theory to build a geometric theory consistent
across probability distributions and channels. To dispense with the assumption that a metric is
Riemannian, we introduce assumptions on asymptotic behavior: weak asymptotic additivity and lower
asymptotic continuity. The proof uses a resource conversion technique. At the end of the paper, an
implication for quantum state metrics is discussed.

1 Introduction 

The aim of this manuscript is to characterize monotone metrics on the space of Markov maps. Here, a metric
means the square of a norm defined on the tangent space; it is not necessarily Riemannian, i.e., not necessarily
induced from an inner product.

So far, there is an extensive literature on metrics on the space of probability distributions and
quantum states. Cencov, in the 1970s, proved that the monotone Riemannian metric on the space of probability
distributions is unique up to a constant multiple, and identical to the Fisher information metric [5]. He also
discussed invariant connections on the same space. Amari and others independently worked on the same
objects, especially from a differential-geometric viewpoint, and applied them to a number of problems in mathematical
statistics, learning theory, time series analysis, dynamical systems, control theory, and so on [T][5]. Quantum
mechanical states are discussed in the literature such as [2] [6] [9] [9] [15]. Among them, Petz [15] characterized all
the monotone Riemannian metrics on the quantum state space using operator mean theory.

As for channels, however, much less is known. To my knowledge, there had been no study of the axiomatic
characterization of distance measures on the classical or quantum channel space, except for the author's
manuscript [Tlj. In that manuscript, upper and lower bounds on monotone channel metrics were derived
using resource conversion theory, and it was proved that no monotone metric can be Riemannian.

The latter result has some impact on the axiomatic theory of monotone metrics on the space of classical
and quantum states, since both Cencov [5] and Petz [15] assumed that metrics are Riemannian. Since classical and
quantum states can be viewed as channels with constant output, it is preferable to dispense with this
assumption. Recalling that the Fisher information is useful in asymptotic theory, it is natural to
introduce assumptions on asymptotic behavior. Hence, we introduce weak asymptotic additivity
and lower asymptotic continuity. With these additional assumptions, we not only recover the uniqueness result
of Cencov [5], but also prove uniqueness of the monotone metric on the channel space.

In this proof, again, we use a resource conversion technique. A difference from the usual resource conversion
technique is that asymptotic continuity is replaced by the slightly weaker lower asymptotic continuity. The reason
is that the former condition is not satisfied by Fisher information.

At the end, an implication for quantum state metrics is discussed.



2 Notations and conventions 

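In discussing probability distributions, the underlying set is denoted by $\Omega$. In discussing channels, $\Omega_{\mathrm{in}}$ ($\Omega_{\mathrm{out}}$)
denotes the totality of the inputs (outputs). In this paper, they are either $\{1,\dots,k\}$ or $\mathbb{R}^d$. $x$, $y$, etc. denote
elements of $\Omega_{\mathrm{in}}$, $\Omega_{\mathrm{out}}$, $\Omega$. Also, $x^n = (x_1, x_2, \dots, x_n)$, $y^n = (y_1, y_2, \dots, y_n)$, etc. denote elements of
$\Omega^{\times n}$, $\Omega_{\mathrm{in}}^{\times n}$, $\Omega_{\mathrm{out}}^{\times n}$.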

Random variables taking values in $\Omega$, $\Omega_{\mathrm{in}}$, $\Omega_{\mathrm{out}}$ are denoted by $X$, $Y$, while random variables taking values
in $\Omega^{\times n}$, $\Omega_{\mathrm{in}}^{\times n}$, $\Omega_{\mathrm{out}}^{\times n}$ are denoted by $X^n$, $Y^n$. The distribution of $X$ is denoted by $P_X$, while its density (with
respect to Lebesgue measure or counting measure, depending on the underlying set) is denoted by $p_X$. $P_{X|Y}$
and $p_{X|Y}$ denote the conditional distribution and its density, respectively. In this paper, the existence of a
density with respect to a standard underlying measure (counting measure for $\{1,\dots,k\}$, and Lebesgue
measure for $\mathbb{R}^d$) is always assumed. Hence, by abuse of terminology, we sometimes say 'distribution $p$'. By $\mathcal{P}$,
$\mathcal{P}_{\mathrm{in}}$, and $\mathcal{P}_{\mathrm{out}}$ we denote the totality of the probability density functions over $\Omega$, $\Omega_{\mathrm{in}}$, and $\Omega_{\mathrm{out}}$, respectively.
A channel $\Phi$ is a linear map from probability distributions over $\Omega_{\mathrm{in}}$ to those over $\Omega_{\mathrm{out}}$, but is also considered
as a map from $L^1(\Omega_{\mathrm{in}})$ to $L^1(\Omega_{\mathrm{out}})$. Hence, we use notation such as $\Phi(P_X)$, as well as $\Phi(p_X)$. The totality
of channels is denoted by $\mathcal{C}$. If there is a need to indicate the input and output spaces, we use notation such
as $\mathcal{C}(\mathcal{P}_{\mathrm{in}}, \mathcal{P}_{\mathrm{out}})$. $\Phi^*$ denotes the dual map of $\Phi$:

$$\int f(x)\,\Phi(p_X)(x)\,d\mu(x) = \int \Phi^*(f)(x)\,p_X(x)\,d\mu(x).$$

A tangent space is denoted by $T_{\cdot}(\cdot)$. $\delta$, $\delta'$, etc. denote elements of $T_p(\mathcal{P})$ (the tangent
space to the set $\mathcal{P}$ at the point $p$), while $\Delta$, $\Delta'$, etc. denote elements of $T_\Phi(\mathcal{C})$. In the paper, we
identify $\delta \in T_p(\mathcal{P})$ with an element of $L^1$ of the form $c(p_1 - p_2)$, where $p_1, p_2 \in \mathcal{P}$. Hence, the differential
map of $\Phi$ is also denoted by $\Phi$, by abuse of notation. $L$ is a random variable defined by

$$L(x) = \frac{\delta(x)}{p(x)}, \quad x \in \Omega,$$

and its law is under $p$, unless otherwise mentioned. Also, $\Delta$ is identified with a linear map of the form
$c(\Phi_1 - \Phi_2)$, where $\Phi_1, \Phi_2 \in \mathcal{C}$.

A pair $\{p,\delta\}$ and $\{\Phi,\Delta\}$ is called local data at $p$ and $\Phi$, respectively, since it decides the local behaviour
of a one-parameter family of distributions (channels) at the point $p$ and $\Phi$, respectively. We denote by $N(a,\sigma^2)$ and
$\delta N(a,\sigma^2)$ the Gaussian distribution with mean $a$ and variance $\sigma^2$, and the signed measure defined by

$$\frac{x-a}{\sigma^2}\,\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-a)^2}{2\sigma^2}\right)dx,$$

respectively. Thus, the local data $\{N(a,\sigma^2), \delta N(a,\sigma^2)\}$ describes the local behaviour of the Gaussian shift family
$\{N(\theta,\sigma^2)\}_\theta$ at $\theta = a$.

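The symbol '$\otimes$' means the direct product of vectors. Given $f \in L^1(\Omega_1)$ and $g \in L^1(\Omega_2)$, $f \otimes g$ is defined by

$$f \otimes g(x_1, x_2) = f(x_1)\,g(x_2).$$

The linear span of $\{f \otimes g\}$ is denoted by $L^1(\Omega_1) \otimes L^1(\Omega_2)$ $(= L^1(\Omega_1 \times \Omega_2))$. Also, given $\Phi_1 \in \mathcal{C}(\mathcal{P}_{\mathrm{in},1}, \mathcal{P}_{\mathrm{out},1})$,
$\Phi_2 \in \mathcal{C}(\mathcal{P}_{\mathrm{in},2}, \mathcal{P}_{\mathrm{out},2})$, $\Phi_1 \otimes \Phi_2 \in \mathcal{C}(\mathcal{P}_{\mathrm{in},1} \otimes \mathcal{P}_{\mathrm{in},2}, \mathcal{P}_{\mathrm{out},1} \otimes \mathcal{P}_{\mathrm{out},2})$ is defined by the relation

$$\Phi_1 \otimes \Phi_2 (f \otimes g) = \Phi_1(f) \otimes \Phi_2(g)$$

and linearity. For real-valued random variables $F_1$ and $F_2$ over $\Omega_1$ and $\Omega_2$, respectively, $F_1 \otimes F_2$ is a random
variable over $\Omega_1 \times \Omega_2$ with

$$F_1 \otimes F_2(x_1, x_2) = F_1(x_1)\,F_2(x_2).$$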
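We use abbreviations such as $f^{\otimes n} := f \otimes f \otimes \cdots \otimes f$, and

$$L^{(n)} := L \otimes 1^{\otimes(n-1)} + 1 \otimes L \otimes 1^{\otimes(n-2)} + \cdots + 1^{\otimes(n-1)} \otimes L,$$

$$\Delta^{(n)} := \Delta \otimes \Phi^{\otimes(n-1)} + \Phi \otimes \Delta \otimes \Phi^{\otimes(n-2)} + \cdots + \Phi^{\otimes(n-1)} \otimes \Delta \in T_{\Phi^{\otimes n}}(\mathcal{C}^{\otimes n}),$$

$$\{\Phi, \Delta\}^{\otimes n} := \{\Phi^{\otimes n}, \Delta^{(n)}\},$$

$$\{p_1, \delta_1\} \otimes \{p_2, \delta_2\} := \{p_1 \otimes p_2,\ \delta_1 \otimes p_2 + p_1 \otimes \delta_2\}.$$

$\|\cdot\|_1$ denotes, for a (signed) measure, the total variation, and for a function, the $L^1$-norm. $\|\cdot\|_{\mathrm{cb}}$ denotes the completely
bounded norm: for a linear map $\Lambda$ from signed measures ($L^1$-functions) to signed measures ($L^1$-functions),

$$\|\Lambda\|_{\mathrm{cb}} := \sup_{p:\ \text{probability distributions}} \|\Lambda \otimes \mathrm{I}\,(p)\|_1.$$

(Here note that $\Lambda$ may not be a Markov map, i.e., it may not map a probability distribution to another distribution.)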

$g_p(\delta)$ and $G_\Phi(\Delta)$ denote a metric, or the square of a norm, in $T_p(\mathcal{P})$ and $T_\Phi(\mathcal{C})$, respectively. In the present
paper, they are not necessarily Riemannian. A probability distribution $p$ is identified with the Markov map
which sends all the input probability distributions to $p$, so that notations such as $G_p(\delta)$ make sense. $J_p(\delta)$
denotes the Fisher information,

$$J_p(\delta) := \mathrm{E}\,(L)^2 = \int (L(x))^2\, p(x)\, d\mu(x) = \int \frac{(\delta(x))^2}{p(x)}\, d\mu(x).$$

Finally, $\Phi(\cdot|x) \in \mathcal{P}_{\mathrm{out}}$ is the distribution (or its density) of the output when the input is $x$. Also, with
$\Delta = c(\Phi_1 - \Phi_2)$,

$$\Delta(\cdot|x) := c\,(\Phi_1(\cdot|x) - \Phi_2(\cdot|x)) \in T_{\Phi(\cdot|x)}(\mathcal{P}_{\mathrm{out}}).$$
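
As a small numerical illustration of the definition of the Fisher information on a finite set, the following Python sketch evaluates the sum of delta(x)^2 / p(x). The function name `fisher_information` and the Bernoulli example are illustrative assumptions, not part of the original text.

```python
def fisher_information(p, delta):
    """Return J_p(delta) = sum_x delta(x)^2 / p(x) on a finite set.

    p     : strictly positive probabilities summing to 1
    delta : tangent vector (signed measure) summing to 0
    """
    if abs(sum(p) - 1.0) > 1e-9 or min(p) <= 0.0:
        raise ValueError("p must be a strictly positive probability vector")
    if abs(sum(delta)) > 1e-9:
        raise ValueError("delta must sum to zero")
    return sum(d * d / q for q, d in zip(p, delta))


# Hypothetical example: Bernoulli family p_theta = (1 - theta, theta),
# whose derivative in theta is delta = (-1, +1).
theta = 0.3
p = [1.0 - theta, theta]
delta = [-1.0, 1.0]
J = fisher_information(p, delta)
closed_form = 1.0 / (theta * (1.0 - theta))
print(J, closed_form)
```

For this family the sum agrees with the familiar closed form 1 / (theta (1 - theta)).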

3 Probability distributions 

Cencov proved uniqueness (up to a constant multiple) of the monotone metric on the space of classical
probability distributions defined over a finite set. In the proof, it is essential that the metric is Riemannian,
i.e., induced from an inner product. As will be noted in Theorem 16, however, this assumption is not
compatible with monotonicity in the case of channels. Hence, we dispense with this assumption and, instead,
introduce new axioms which govern the asymptotic behaviour of a metric.

3.1 Axioms for the metrics of probability distributions 
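(M0) $g_p(\delta) \ge g_{\Phi(p)}(\Phi(\delta))$.

(A0) $\lim_{n\to\infty} \frac{1}{n}\, g_{p^{\otimes n}}(\delta^{(n)}) = g_p(\delta)$.

(C0) If $\|p^{\otimes n} - q^n\|_1 \to 0$ and $\frac{1}{\sqrt{n}}\|\delta^{(n)} - \delta'^n\|_1 \to 0$, then

$$\liminf_{n\to\infty} \frac{1}{n}\left(g_{q^n}(\delta'^n) - g_{p^{\otimes n}}(\delta^{(n)})\right) \ge 0.$$

(N0) (Normalization) In case $\{p, \delta\} = \{N(0,1), \delta N(0,1)\}$ (a Gaussian shift family),

$$g_p(\delta) = 1.$$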
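
The axioms (M0) and (A0) can be checked numerically for the Fisher information on a small finite set. The sketch below is only an illustration under assumed data: the helper names (`random_markov_map`, `apply_map`, `tensor_power`) and the particular vectors `p`, `delta` are hypothetical, and Markov maps are represented as column-stochastic matrices.

```python
import random
from itertools import product


def fisher_information(p, delta):
    return sum(d * d / q for q, d in zip(p, delta))


def random_markov_map(n_out, n_in, rng):
    """Column-stochastic matrix K[y][x] = Pr(output y | input x)."""
    cols = []
    for _ in range(n_in):
        w = [rng.random() + 0.05 for _ in range(n_out)]
        s = sum(w)
        cols.append([v / s for v in w])
    return [[cols[x][y] for x in range(n_in)] for y in range(n_out)]


def apply_map(K, v):
    return [sum(K[y][x] * v[x] for x in range(len(v))) for y in range(len(K))]


def tensor_power(p, delta, n):
    """Return the n-fold product of p and delta^(n) = L^(n) times that product."""
    pn, dn = [], []
    for xs in product(range(len(p)), repeat=n):
        prob = 1.0
        for x in xs:
            prob *= p[x]
        score = sum(delta[x] / p[x] for x in xs)  # L^(n)(x_1, ..., x_n)
        pn.append(prob)
        dn.append(score * prob)
    return pn, dn


rng = random.Random(0)
p = [0.5, 0.3, 0.2]
delta = [0.2, -0.5, 0.3]
J = fisher_information(p, delta)

# (M0): Fisher information does not increase under a Markov map.
monotone = True
for _ in range(200):
    K = random_markov_map(4, 3, rng)
    if fisher_information(apply_map(K, p), apply_map(K, delta)) > J + 1e-12:
        monotone = False

# (A0): (1/n) times the information of the n-fold local data equals J.
ratios = []
for n in (1, 2, 3, 4):
    pn, dn = tensor_power(p, delta, n)
    ratios.append(fisher_information(pn, dn) / n)
print(monotone, J, ratios)
```

For the Fisher information the additivity in (A0) holds for every finite n, not only in the limit, because the cross terms vanish when delta sums to zero.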

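3.2 Simulation and asymptotic tangent simulation: definition

Simulation of $\{p_\theta\}$ is the pair $\{q_\theta, \Lambda\}$ with

$$p_\theta = \Lambda(q_\theta),$$

and tangent simulation of the local data $\{p, \delta\}$ is the triplet $\{q, \delta', \Lambda\}$ with

$$p = \Lambda(q), \quad \delta = \Lambda(\delta').$$

If in addition there is $\Lambda'$ with

$$q = \Lambda'(p), \quad \delta' = \Lambda'(\delta),$$

we say $\{p, \delta\}$ and $\{q, \delta'\}$ are equivalent, and express this relation by the notation

$$\{p, \delta\} \equiv \{q, \delta'\}.$$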

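An asymptotic tangent simulation of $\{p, \delta\}^{\otimes n}$ means a sequence $\{q^n, \delta'^n, \Lambda^n\}_{n=1}^{\infty}$ of triplets of a
probability density $q^n$, an $L^1$-function $\delta'^n$ with $\int \delta'^n d\mu = 0$, and a Markov map $\Lambda^n$, such that

$$\lim_{n\to\infty} \|p^{\otimes n} - \Lambda^n(q^n)\|_1 = 0, \qquad (1)$$

$$\lim_{n\to\infty} \frac{1}{\sqrt{n}}\|\delta^{(n)} - \Lambda^n(\delta'^n)\|_1 = 0. \qquad (2)$$

We call $\max\{\|p^{\otimes n} - \Lambda^n(q^n)\|_1,\ \frac{1}{\sqrt{n}}\|\delta^{(n)} - \Lambda^n(\delta'^n)\|_1\}$ the error of the asymptotic tangent simulation. In all
the cases treated in the present paper, the following stronger conditions are satisfied:

$$\|p^{\otimes n} - \Lambda^n(q^n)\|_1 \le \frac{1}{n^{1/4}}\, C(\{p, \delta\}), \qquad (3)$$

$$\frac{1}{\sqrt{n}}\|\delta^{(n)} - \Lambda^n(\delta'^n)\|_1 \le \frac{1}{n^{1/4}}\, C(\{p, \delta\}). \qquad (4)$$

Below, $C(\{p, \delta\})$ is sometimes denoted by $C$, as long as no confusion is likely to arise.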
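Proposition 1 Let $L(x) := \delta(x)/p(x)$. Then,

$$\{p, \delta\} \equiv \{p_L(l),\ l\,p_L(l)\}.$$

Proof. Observe

$$\int_{x: L(x) = l} p(x)\, d\mu(x) = p_L(l),$$

$$\int_{x: L(x) = l} \delta(x)\, d\mu(x) = \int_{x: L(x) = l} L(x)\, p(x)\, d\mu(x) = l \int_{x: L(x) = l} p(x)\, d\mu(x) = l\, p_L(l),$$

where $\mu$ is either Lebesgue measure ($\Omega = \mathbb{R}^d$) or counting measure ($\Omega = \{1, \dots, k\}$). Also,

$$p_{X|L}(x|l)\, p_L(l) = \begin{cases} p(x), & (l = L(x)), \\ 0, & \text{otherwise}, \end{cases}$$

$$p_{X|L}(x|l)\, \{l\, p_L(l)\} = \begin{cases} L(x)\, p(x) = \delta(x), & (l = L(x)), \\ 0, & \text{otherwise}. \end{cases}$$

Therefore, letting $\nu$ be the measure induced from $\mu$ via the change of variable $l = \delta(x)/p(x)$,

$$\int p_{X|L}(x|l)\, p_L(l)\, d\nu(l) = p(x),$$
$$\int p_{X|L}(x|l)\, \{l\, p_L(l)\}\, d\nu(l) = \delta(x).$$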
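
The equivalence in Proposition 1 can be made concrete on a finite set. The following sketch uses exact rational arithmetic and hypothetical local data (the vectors `p` and `L` are illustrative assumptions) to show the two Markov maps: the forward map x -> L(x), and the backward map given by the conditional distribution of X given L.

```python
from collections import defaultdict
from fractions import Fraction as F

# Hypothetical local data on a 4-point set; L takes the same value at x = 0, 1.
p = [F(1, 10), F(2, 10), F(3, 10), F(4, 10)]
L = [F(1, 2), F(1, 2), F(-1, 5), F(-9, 40)]
delta = [l * q for l, q in zip(L, p)]
assert sum(delta) == 0  # delta is a tangent vector

# Forward Markov map: x -> l = L(x). It sends {p, delta} to {p_L, l * p_L}.
p_L = defaultdict(F)
d_L = defaultdict(F)
for q, d, l in zip(p, delta, L):
    p_L[l] += q
    d_L[l] += d
forward_ok = all(d_L[l] == l * p_L[l] for l in p_L)


# Backward Markov map: l -> x with the conditional probability p_{X|L}(x|l),
# built from p and the level sets of L.
def kernel(x, l):
    return p[x] / p_L[l] if L[x] == l else F(0)


p_back = [sum(kernel(x, l) * p_L[l] for l in p_L) for x in range(len(p))]
d_back = [sum(kernel(x, l) * (l * p_L[l]) for l in p_L) for x in range(len(p))]
backward_ok = (p_back == p) and (d_back == delta)

# Equivalent local data carry the same Fisher information.
J_x = sum(d * d / q for q, d in zip(p, delta))
J_l = sum((l * m) ** 2 / m for l, m in p_L.items())
print(forward_ok, backward_ok, J_x == J_l)
```

Exact fractions are used so that points with equal likelihood-ratio value are grouped reliably; with floating point the grouping by `L[x] == l` could fail.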



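Lemma 2 Let $L'^n := \delta'^n/q^n$, and suppose that $\{q^n, \delta'^n\} = \{p_{L'^n},\ l'^n p_{L'^n}\}$. Let $\tilde{L}^n$ be a random variable
obeying the distribution

$$p_{\tilde{L}^n}(\tilde{l}^n) := \tilde{\Lambda}^n(p_{L'^n})(\tilde{l}^n) := \int P^n(\tilde{l}^n | l'^n)\, p_{L'^n}(l'^n)\, dl'^n.$$

Define $\Lambda^n$ by

$$\Lambda^n(q)(x^n) := \int \tilde{\Lambda}^n(q)(\tilde{l}^n)\, p_{X^n|L^{(n)}}(x^n | \tilde{l}^n)\, d\tilde{l}^n.$$

For $\{q^n, \delta'^n, \Lambda^n\}_{n=1}^{\infty}$ to satisfy (3) and (4), it suffices that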

C 



and 



where 











max 


E 







n V 

2C" + 3o < C. 



< a < oo. 
Proof. Since $p_{X^n|L^{(n)}}(x^n | l^{(n)}) \le 1$, (5) implies

||p«"_A"(g«)||^= sup EP;,„|i(„ 
.4:mcasurablG 

which is (3). By Chebyshev's inequality,
Lin)


1 


Lin) 


> 






' \/n 







-EI 



n 



< 



c 



an 



^1/4 



Also, 







1 








A" 




)(/") = 



jfn r)n ( in \ ifn \ 



= E 



L'"|L" = r'lp£,. (/") 



Therefore, by Propositionll] 
1 



< 



_ A" (5'") 

i 

rpi(„, (r)-A" 

<^ / |rpi<„, r)-z"Pi„ r)|dr 

rp^„ (z") - E [l'" |i" = r 1 p^„ (z") dz" 
|rpi(„, a")-z"Pi„ r)|dz" + ^ 



< 



1 
1 

\/n 
1 

1 

1 



< E' 







1 


Lin) 
















\Jn 







2a a 



< 2rii/4 - A" + 
^ 3a + 2C' ^ C 



3a 



1/4- 



Proposition 3 Suppose there is an asymptotic tangent simulation of {p2_^i,S^_^i} by with the 

error fi (n) . Then, if k is a constant ofn, there is an asymptotic tangent simulation of{p^^,S^} by 
with the error X^iLi^ fi {^)- 

Proof. Obvious, and thus omitted. ■
3.3 Simulation of probability distribution family: a background from decision theory

The concept of simulation has been discussed in the field of statistical decision theory in relation to the notion
of sufficiency [18]. Consider families $\mathcal{E} = \{p_\theta\}_{\theta\in\Theta}$, $\mathcal{F} = \{q_\theta\}_{\theta\in\Theta}$ of probability distributions, and a function
$\varepsilon : \theta \to \varepsilon(\theta) \ge 0$. Also, let $(D, \mathfrak{D})$ be a decision space. Then $\mathcal{F}$ is said to be $\varepsilon$-deficient relative to $\mathcal{E}$
if, for any loss function $W_\theta$ with $|W_\theta(d)| \le 1$ and for any decision function $d : x \to d(x) \in D$, there is
$d' : y \to d'(y) \in D$ with

$$\int q_\theta(y)\, W_\theta(d'(y))\, d\mu' \le \int p_\theta(x)\, W_\theta(d(x))\, d\mu + \varepsilon_\theta. \qquad (7)$$

0-deficiency is simply called deficiency. By the celebrated randomization criterion, a necessary and sufficient
condition for $\varepsilon$-deficiency is the existence of $\Lambda$ with

$$\|p_\theta - \Lambda(q_\theta)\| \le \varepsilon_\theta.$$

In particular, 0-deficiency is equivalent to $Y \sim q_\theta$ being a sufficient statistic of $\mathcal{E} = \{p_\theta\}_{\theta\in\Theta}$. Thus, $\varepsilon$-deficiency
is an approximate version of sufficiency.

This randomization criterion motivates our emphasis on simulation. Its 'local' version

sup \\pe - A {qe)\\ = 
e 

dpe K f 9qe\ , 

is called local $\varepsilon$-deficiency at $\theta$.
3.4 Gaussian shift family 

Proposition 4 Suppose $\{p, \delta\} = \{N(\theta, \sigma^2), \delta N(\theta, \sigma^2)\}$. Suppose also that (M0) and (N0) hold. Then we
have

$$g_p(\delta) = \frac{1}{\sigma^2} = J_p(\delta).$$

Proof. By an affine coordinate change of the data space $\Omega = \mathbb{R}$, $\{p, \delta\} = \{N(\theta, \sigma^2), \delta N(\theta, \sigma^2)\}$ is transformed
to $\{q, \delta'\} = \{N(0,1), \frac{1}{\sigma}\delta N(0,1)\}$. The inverse coordinate change of the data space $\Omega$
sends $\{q, \delta'\}$ to $\{p, \delta\}$. Therefore, by (M0) and (N0),

$$g_p(\delta) = g_{N(0,1)}\left(\frac{1}{\sigma}\,\delta N(0,1)\right) = \frac{1}{\sigma^2}\, g_{N(0,1)}(\delta N(0,1)) = \frac{1}{\sigma^2}.$$

Remark 5 Similarly, one can prove that $\{N(0,1), \delta N(0,1)\}^{\otimes n}$, $\{N(0, \frac{1}{n}), \delta N(0, \frac{1}{n})\}$, and $\{N(0,1), \sqrt{n}\,\delta N(0,1)\}$
are equivalent.
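
As a numerical sanity check of Proposition 4, the sketch below integrates the square of the score of a Gaussian shift family and compares it with 1/sigma^2. It assumes that the signed measure delta N(a, sigma^2) is the derivative of the Gaussian density with respect to its mean, so that L(x) = (x - a)/sigma^2; the function name and the integration parameters are illustrative choices.

```python
import math


def gaussian_shift_fisher(a, sigma, steps=20000, width=12.0):
    """Midpoint-rule estimate of J = E[L^2], with L(x) = (x - a) / sigma^2."""
    lo, hi = a - width * sigma, a + width * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        density = (math.exp(-((x - a) ** 2) / (2 * sigma ** 2))
                   / math.sqrt(2 * math.pi * sigma ** 2))
        score = (x - a) / sigma ** 2
        total += score * score * density * h
    return total


for sigma in (0.5, 1.0, 2.0):
    print(sigma, gaussian_shift_fisher(0.0, sigma), 1.0 / sigma ** 2)

# Remark 5 in numbers: N(0, 1/n) carries n times the information of N(0, 1).
n = 9
J_n = gaussian_shift_fisher(0.0, 1.0 / math.sqrt(n))
print(n, J_n)
```

The last lines illustrate why the n-fold product of the unit Gaussian shift and the single shift with variance 1/n can be equivalent: both have information n.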
3.5 On local asymptotic normality

Asymptotic tangent simulation by Gaussian shift is somewhat analogous to so-called local asymptotic
normality (LAN, in short) [16]. The differences between them are as follows. First, asymptotic tangent simulation
is concerned only with a particular point $p$, while LAN is concerned also with its neighbourhood. On the
other hand, (3) for asymptotic tangent simulation is norm convergence, and thus obviously stronger than
convergence of $\frac{1}{\sqrt{n}} L^{(n)}$ to $N(0, J)$ in law.

3.6 Zero bias transform 

Let $X$ be a real-valued random variable with the distribution $P_X$. Then,

$$w_X(x) := \frac{1}{V(X)} \int_{-\infty}^{x} (-t)\, dP_X(t)$$

satisfies $\int w_X(x)\, dx = 1$, and thus defines a random variable $X^\circ$. The map from $X$ to $X^\circ$ is called the
zero-bias transform fS^m^ f lS f ^fSf. The following lemmas are proved in the literature mentioned above.

Lemma 6 Suppose $0 < V(X) < \infty$ and $E(X) = 0$. Suppose also that $f : \mathbb{R} \to \mathbb{R}$ is absolutely continuous,
differentiable, and $E|f'(X^\circ)| < \infty$. Then,

$$E(X f(X)) = V(X)\, E f'(X^\circ). \qquad (8)$$

Lemma 7 Let $S := \sum_{i=1}^{n} X_i$, where $X_1, \dots, X_n$ are IID with $0 < V(X_i) < \infty$ and $E(X_i) = 0$. Then,
denoting convolution by $*$,

$$S^\circ \sim S - X_n + X_n^\circ,$$

$$w_S = w_X * (p_X)^{*(n-1)}.$$

Lemma 8 Let $S := a_1 X_1 + a_2 X_2$, where $a_1^2 + a_2^2 = 1$. If $w_{X_i}(x)/p_{X_i}(x) < \infty$ and $E(w_{X_i}(X_i)/p_{X_i}(X_i) - 1)^2 < \infty$
($i = 1, 2$), then

$$E\left(w_S(S)/p_S(S) - 1\right)^2 \le a_1^4\, E\left(w_{X_1}(X_1)/p_{X_1}(X_1) - 1\right)^2 + a_2^4\, E\left(w_{X_2}(X_2)/p_{X_2}(X_2) - 1\right)^2.$$
Lemma 9 The random variable X° is supported on a subset of the convex hull of the support of X . 
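
The identity (8) of Lemma 6 and the support statement of Lemma 9 can be checked numerically for a centered binary variable. The sketch below is an illustration under an assumption: it uses the standard form of the zero-bias density, w_X(x) = E[X; X > x] / V(X), which for a centered X coincides with the integral of (-t) over (-infinity, x]. The names `zero_bias_density` and `expect_zero_bias` are hypothetical.

```python
import math

eta = 0.3
# Centered binary variable: X = -eta w.p. 1 - eta, X = 1 - eta w.p. eta.
support = [(-eta, 1.0 - eta), (1.0 - eta, eta)]  # (value, probability)
variance = sum(pr * v * v for v, pr in support)


def zero_bias_density(x):
    """Assumed standard form: w_X(x) = E[X ; X > x] / V(X)."""
    return sum(pr * v for v, pr in support if v > x) / variance


def expect_zero_bias(func, steps=20000):
    """Midpoint-rule integral of func against w_X over the convex hull of X."""
    lo, hi = -eta, 1.0 - eta
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += func(x) * zero_bias_density(x) * h
    return total


lhs = sum(pr * v * math.sin(v) for v, pr in support)  # E[X f(X)] with f = sin
rhs = variance * expect_zero_bias(math.cos)           # V(X) E f'(X zero-biased)
mass = expect_zero_bias(lambda x: 1.0)                # total mass of w_X
print(lhs, rhs, mass)
```

In this example the zero-bias density is constant on the interval between the two atoms, i.e. the transformed variable is uniform on the convex hull of the support, in line with Lemma 9.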

3.7 Binary distributions 

Consider a family of binary distributions $\{p_\theta\}$, where the data space is $\Omega = \{0, 1\}$. Letting $N_1(x^n)$ be the
number of 1s in the sequence $x^n = x_1 x_2 \cdots x_n$,

$$L^{(n)}(x^n) = N_1(x^n)\{L(1) - L(0)\} + nL(0) = a\{N_1(x^n) - n\,p(1)\},$$

where $a := L(1) - L(0)$.

We compose $\Lambda^n$ which satisfies (5) with $\{p_{L^{(n)}},\ l^{(n)} p_{L^{(n)}}\} \equiv \{p^{\otimes n}, \delta^{(n)}\} := \{p_\theta^{\otimes n}, \delta_\theta^{(n)}\}$ and

$$\{q^n, \delta'^n\} := \{N(0, nJ_p(\delta)),\ nJ_p(\delta)\,\delta N(0, nJ_p(\delta))\} \equiv \{N(0,1),\ \sqrt{nJ_p(\delta)}\,\delta N(0,1)\} \equiv \{N(0,1), \delta N(0,1)\}^{\otimes nJ_p(\delta)},$$

by letting $\tilde{L}^n$ be the element of the set

$$\{a(n_1 - p(1)\,n)\ ;\ n_1 \in \mathbb{N},\ n_1 \le n\}$$

closest to $L'^n \sim N(0, nJ_p(\delta))$.
One can easily verify 



E 



E 



l/~\2 1 9 1 

-E(L") <-E(L'")V-1 
n V / n n 

< Jp ((5) + - (af . 
n 



®, or 

|K"-A(,")||,<l|P...-Pi.|l.<;^^. 
is the direct consequence of Theorem 10 below. Hence, by Lemma 2, the error of this tangent simulation is



-YTT with 



+ 3 Jp (S) + 3 (ay + 3 |a| , 



VMS) 

which is a continuous function of $\delta(0)$ and $p(0)$, and is bounded on any compact region.

Theorem 10 Let $X_1, X_2, \dots, X_n$ be IID random variables taking values in $\{0, 1\}$, with $\Pr\{X_i = 1\} = \eta$.
Denote their variance by $\sigma^2$, and define
Y,, := 



Ina 



Suppose 



whe 



1 

nrr ^— ^ 

a = \ja:, 



A" 



1 1 
— ^ — 7=^ 

2y'na 2y'no 



and $z$ runs over a subset of $(\mathbb{Z} - n\eta)/(\sqrt{n}\,\sigma)$. Then,



|Pr{5„e^}-Pr{N(0,l)e^}| < 



/no 
Proof. Letting



we have 



i^A {x) := / {XA (t) - Pr {N (0, 1) G A}) e-'dt, 



|Pr{5„e^}-Pr{N(0,l)e^}| 



IE ( ^"0-4 (Sn) - SnTpA (Sn) 



(") 



(Hi) 



E ( (Sn) - ^^A iS°J 



IE iXA (Sn) - XA (S:)) + E [Sn^A (Sn) " S^l^A {S:))\ 



= 2|E(x^ {S^)^Xa{S:))\ 

(iv) 



(v) 



(9) 



(10) 



where (i) and (iii) are due to the definition of $\psi_A$, (ii) is due to (8), (iv) is due to

and (v) is due to Lemma 7. By definition, one can verify that $X_i^\circ$ is uniformly distributed over $[0, 1]$.
Also,



\&{XA{Sn)-XA{Sn-Y^ + Y:))\ 

{n-1 ~j r n-1 

Pr jxi^. ^k]\{l-i^)-l\+ Pr = k-l 



k=0 



E 



fc=0 k i=l 

I 1 I n 

h- 2 



I n-k { , , ll Ifc/ 1 
1 1-^7 

1— 77n[_ 2J rju \ 2 



r?(l-7y) 



fc = I 2 = 1 > 



V 

n 



< 



which leads to the assertion. 



k i=l 



{r]{l-r])} 



1/2 
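
The kind of bound stated in Theorem 10 can be explored numerically. The sketch below is an illustration only: it computes, for the standardized binomial sum, the largest gap between its probability and the standard Gaussian probability over unions of lattice cells of width 1/(sqrt(n) sigma), and prints the gap rescaled by sqrt(n). The constant in the theorem is not recoverable from the text, so no specific constant is checked; the function names are hypothetical.

```python
import math


def binom_pmf(n, k, eta):
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(eta) + (n - k) * math.log(1.0 - eta))
    return math.exp(log_pmf)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lattice_gap(n, eta):
    """Largest |Pr{S_n in A} - Pr{N(0,1) in A}| over unions A of lattice cells."""
    s = math.sqrt(n * eta * (1.0 - eta))
    half = 0.5 / s
    excess = 0.0   # cells where the binomial mass exceeds the Gaussian mass
    deficit = 0.0  # cells where the Gaussian mass exceeds the binomial mass
    for k in range(n + 1):
        z = (k - n * eta) / s
        diff = binom_pmf(n, k, eta) - (normal_cdf(z + half) - normal_cdf(z - half))
        if diff > 0:
            excess += diff
        else:
            deficit -= diff
    return max(excess, deficit)


gaps = {n: lattice_gap(n, 0.3) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(n, gap, gap * math.sqrt(n))
```

The rescaled column stays of order one as n grows, which is the 1/sqrt(n) behaviour that the theorem asserts for such sets.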



3.8 Distributions over the finite set 

Theorem 11 Suppose $p$ is a probability distribution and $\delta$ is a signed measure over a set $\Omega$, with $|\Omega| = k$
($k < \infty$). Let $J := J_p(\delta)$, $\varepsilon > 0$ and

$$\{q^n, \delta'^n\} := \{N(0,1),\ \sqrt{n(J + \varepsilon)}\,\delta N(0,1)\} \equiv \{N(0,1), \delta N(0,1)\}^{\otimes n(J+\varepsilon)}.$$

Then, we can compose $\Lambda^n$ with the error $A/n^{1/4}$, where $A$ is a continuous function of $\{p(x), \delta(x)\ ;\ x = 1, \dots, k-1\}$
and is bounded on any compact region.

Proof. Since binary distributions can be simulated by Gaussian shift as in Subsection 3.7, due to Proposition 3
we only have to compose an asymptotic tangent simulation of $\{p, \delta\}^{\otimes n}$ by binary distributions. For that, we first
asymptotically simulate $\{p, \delta\}^{\otimes n}$ by $\{p_a, \delta_a\}^{\otimes n} \otimes \{p_A, \delta_A\}^{\otimes n_a}$, where $\{p_a, \delta_a\}$ and $\{p_A, \delta_A\}$ are defined over
the binary set $\Omega_a$ and the set $\Omega_A$ with $(k-1)$ elements, respectively. Then, by virtue of Proposition 3,
an inductive argument leads to an asymptotic tangent simulation by binary distributions.
Let

$$p_a(0) := p(k), \quad p_a(1) := \sum_{x=1}^{k-1} p(x),$$

$$\delta_a(0) := \delta(k), \quad \delta_a(1) := \sum_{x=1}^{k-1} \delta(x), \quad L_a := \frac{\delta_a}{p_a},$$

$$p_A(x) := \frac{p(x)}{p_a(1)}, \quad (x = 1, \dots, k-1),$$

$$\delta_A(x) := \frac{\delta(x)}{p_a(1)} - \frac{p(x)\,\delta_a(1)}{p_a(1)^2}, \quad (x = 1, \dots, k-1),$$

$$L_A(x) := \frac{\delta_A(x)}{p_A(x)}, \quad (x = 1, \dots, k-1),$$

$$n_a := n\,(p_a(1) + \varepsilon) \quad (\varepsilon > 0).$$

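Also, let $x_a^n = x_{a1} x_{a2} \cdots x_{an} \in \Omega_a^{\times n}$, $x_A^{n_a} = x_{A1} x_{A2} \cdots x_{A n_a} \in \Omega_A^{\times n_a}$, $X_a \sim p_a$, $X_A \sim p_A$, $X_a^n \sim p_a^{\otimes n}$, and
$X_A^{n_a} \sim p_A^{\otimes n_a}$. Denote by $N_1(x_a^n)$ the number of 1s in the sequence $x_a^n = x_{a1} x_{a2} \cdots x_{an}$. Also, we identify the
pair $(x_a, x_A)$ with $x$, by the correspondence

$$(1, x) \leftrightarrow x \quad (x = 1, \dots, k-1),$$
$$(0, \#) \leftrightarrow x = k,$$

where # stands for the empty string. To define the asymptotic tangent simulation, one defines a function
$F : \Omega_a^{\times n} \times \Omega_A^{\times n_a} \to \Omega^{\times n}$ such that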

' (TVi {x^J < na) , 

fc" {Ni{x^)>na). 



Using F, we define 



A" (r") (x") := ^ r" (y") 



:= A" {p'^") , J" A" (j^")) 
Then,



\\p"-p'^% = Yl 



where 



= 2 J2 P®"" 

x'^:N-i_{x2)>na 

< 2exp{-nCa,e}, 



Ca,s ■= Pa (0) In — — h Pa (1) In ■ 



Pa (0) - e 



>a(l)+e' 



and 
1 



5" - 



Also, 



1 

1 y/n 
2 



J^i(a:S)>"<> 



JVi(xj)>na 



< 



E 



x3:A''i(x3)>na 



nmax{|La (0)| , |La (1)|} + n 



max{|i„ (0)|,|L„(1)|} + 



max La ix) 

l<x<k-l 



max La {xai) 

l<XAl<fe— 1 



exp{-nCa,£}. 



1 (,^-i^ML^\^i^ML 

Pik) f'(^) 

,2 fe-1 1 



R(o)r 



+E 



K(1),5a(x)+5„(1)p^ (x)K 



Pa(0) ^Pa(l)PA(a;) 
K(0)}^ , {^„(1)}^ , n^V^i^£M!^A n^V^^A r ^ 

Ja(^a)+Pa(l) Ja(<5a). 



Analogously, one can compose an asymptotic tangent simulation of $\{p_A, \delta_A\}^{\otimes n_a}$ by $\{p_b, \delta_b\}^{\otimes n_a} \otimes \{p_B, \delta_B\}^{\otimes n_b}$,
where $\{p_b, \delta_b\}$ and $\{p_B, \delta_B\}$ are defined over the binary set $\Omega_b$ and the set $\Omega_B$ with $(k-2)$ elements,
respectively, where

$$p_b(0) := p_A(k-1), \quad p_b(1) := \sum_{x=1}^{k-2} p_A(x),$$

$$n_b := n_a\,(p_b(1) + \varepsilon),$$

$$J_{p_A}(\delta_A) = J_{p_b}(\delta_b) + p_b(1)\, J_{p_B}(\delta_B).$$

Repeating this process recursively, by Proposition 3, one can asymptotically simulate $\{p, \delta\}^{\otimes n}$ by



(11) 



($\{p_z, \delta_z\}$ is defined over $\{1\}$, and thus is trivial) with the error



i—a j—A 



max Lj (x) 



l<x<k-l 



^exp{-nCi,e} , 



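which is upper bounded by $B/n^{1/4}$, where $B$ is a continuous function of $\{p(x), \delta(x)\ ;\ x = 1, \dots, k-1\}$. Due to
Subsection 3.7, (11) can be simulated by

$$\{N(0,1), \delta N(0,1)\}^{\otimes nJ_a(\delta_a)} \otimes \{N(0,1), \delta N(0,1)\}^{\otimes n_a J_b(\delta_b)} \otimes \cdots \otimes \{N(0,1), \delta N(0,1)\}^{\otimes n_y J_z(\delta_z)}$$

$$= \{N(0,1), \delta N(0,1)\}^{\otimes n(J_p(\delta) + f(\varepsilon))},$$

where $\lim_{\varepsilon\to 0} f(\varepsilon) = 0$, with the error $B'/n^{1/4}$, where $B'$ is a continuous function of $\{p(x), \delta(x)\ ;\ x = 1, \dots, k-1\}$.
Here, '=' is due to

$$nJ_a(\delta_a) + n_a J_b(\delta_b) + n_b J_c(\delta_c) + \cdots + n_y J_z(\delta_z)$$

$$= nJ_a(\delta_a) + n\,(p_a(1) + \varepsilon)\, J_b(\delta_b) + n\,(p_a(1) + \varepsilon)(p_b(1) + \varepsilon)\, J_c(\delta_c) + \cdots + n \prod_{i=a}^{y} (p_i(1) + \varepsilon)\, J_z(\delta_z)$$

$$= n\,(J_p(\delta) + f(\varepsilon)),$$

where the last identity is due to

$$J_p(\delta) = J_a(\delta_a) + p_a(1)\, J_b(\delta_b) + p_a(1)\, p_b(1)\, J_c(\delta_c) + \cdots + \prod_{i=a}^{y} p_i(1)\, J_z(\delta_z).$$

Therefore, due to Proposition 3, we obtain an asymptotic tangent simulation of $\{p, \delta\}^{\otimes n}$ by $\{q^n, \delta'^n\}$ with
the error $A/n^{1/4}$, and the assertion is proved.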
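
The decomposition of the Fisher information used in the proof (the information of {p, delta} equals that of the binary part plus p_a(1) times that of the conditional part) can be verified numerically. The sketch below is illustrative: the helper names are hypothetical, and the conditional tangent vector is assumed to be the derivative of the conditional distribution p(x)/p_a(1).

```python
def fisher_information(p, delta):
    return sum(d * d / q for q, d in zip(p, delta))


def split_last(p, delta):
    """Split {p, delta} on {1..k} into binary data (is x = k or not)
    and conditional data on {1..k-1}."""
    mass, dmass = sum(p[:-1]), sum(delta[:-1])
    p_a, d_a = [p[-1], mass], [delta[-1], dmass]
    p_A = [q / mass for q in p[:-1]]
    # Assumed form: derivative of the conditional distribution p(x) / p_a(1).
    d_A = [d / mass - q * dmass / mass ** 2 for q, d in zip(p[:-1], delta[:-1])]
    return (p_a, d_a), (p_A, d_A)


def chained_information(p, delta):
    """Peel off one point at a time: J_a + p_a(1) * (J_b + p_b(1) * (...))."""
    if len(p) == 1:
        return 0.0  # local data over a one-point set is trivial
    (p_a, d_a), (p_A, d_A) = split_last(p, delta)
    return fisher_information(p_a, d_a) + p_a[1] * chained_information(p_A, d_A)


p = [0.1, 0.2, 0.3, 0.4]
delta = [0.05, -0.02, 0.07, -0.10]
direct = fisher_information(p, delta)
chained = chained_information(p, delta)
print(direct, chained)
```

The recursion mirrors the inductive argument: each step removes one point of the finite set and leaves a binary problem, which is what allows the reduction to Subsection 3.7.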
3.9 A continuous random variable with smooth density

Theorem 12 Let $\Omega = \mathbb{R}$. Suppose $L(x) = \delta(x)/p(x)$ exists and is a continuous function of $x$. Let $J := J_p(\delta)$,

$$\{q^n, \delta'^n\} := \{N(0,1),\ \sqrt{nJ}\,\delta N(0,1)\} \equiv \{N(0,1), \delta N(0,1)\}^{\otimes nJ},$$

and define $\tilde{L}^n = \tilde{\Lambda}^n(L'^n) := L'^n$. Suppose

$$E\left(\frac{\int_{-\infty}^{L} (-t)\, p_L(t)\, dt}{J\, p_L(L)}\right)^2 < \infty \qquad (12)$$

holds. Then, $\{q^n, \delta'^n, \Lambda^n\}$ satisfies (5) and (6). Thus, by Lemma 2, it satisfies (3) and (4).

Proof. The assertion is essentially the same as Theorem 2.3 of [14]. For the sake of completeness, however,
the whole argument is described below. Let $S_n := \frac{1}{\sqrt{nJ}} L^{(n)}$. In the same way as in the proof of Theorem 10,
we have



|Pr{5„ e^}-Pr{N(0,l) e^}| 

= |E ixA (Sn) - XA (S:)) + E iS,,^A (Sn) ~ S^i^A {S°))\ 



<2||ps„ -Ps=||i = 2E 



PS„ (Sn) 



where ipA is defined by and thus \xipA < 1. Hence, it boils down to the evaluation of E ^ ^"(''g "j 
which, due to Lemma[8l is not larger than 



=E 



=E 



V J 



'iLi-t) PL {t)dt 
J PL (0 



- 1 



1 



E 



'J_^{-t) PL it)dt' 



JPL (0 

Hence, we have (5). Also, it is easy to verify



E 



L'" \L" 



= 0, 



E {Lf = -E ( L" 
n 



Jp (S) . 



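A trivial sufficient condition for (12) is that the support of $p_L$ is bounded. Also, suppose

$$\frac{a_1}{|t|^{\alpha_1}} \le p_L(t) \le \frac{b_1}{|t|^{\alpha_1}}, \quad (t \le \exists t_1),$$

$$\frac{a_2}{|t|^{\alpha_2}} \le p_L(t) \le \frac{b_2}{|t|^{\alpha_2}}, \quad (t \ge \exists t_2)$$

hold for some real constants $a_i$, $b_i$, $\alpha_i$ ($i = 1, 2$). Then, if $y \le t_1$,

$$\frac{1}{p_L(y)} \int_{-\infty}^{y} (-t)\, p_L(t)\, dt \le \frac{b_1}{a_1}\, \frac{1}{\alpha_1 - 2}\, y^2,$$

and, due to $\int (-t)\, p_L(t)\, dt = 0$, if $y \ge t_2$,

$$\frac{1}{p_L(y)} \int_{-\infty}^{y} (-t)\, p_L(t)\, dt = \frac{1}{p_L(y)} \int_{y}^{\infty} t\, p_L(t)\, dt \le \frac{b_2}{a_2}\, \frac{1}{\alpha_2 - 2}\, y^2.$$

Hence, if

$$\min\{\alpha_1, \alpha_2\} > 4,$$

we have (12).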

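The following conditions are also sufficient:

$$a_1 e^{-|t|^{\alpha_1}} \le p_L(t) \le b_1 e^{-|t|^{\alpha_1}}, \quad (t \le \exists t_1),$$

$$a_2 e^{-|t|^{\alpha_2}} \le p_L(t) \le b_2 e^{-|t|^{\alpha_2}}, \quad (t \ge \exists t_2) \qquad (13)$$

for some real constants $a_i$, $b_i$, and $\alpha_i$ ($i = 1, 2$), with

$$\min\{\alpha_1, \alpha_2\} > 2.$$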

Then, if y < and y < —1, 

(-0PLWdt< r (-t)6-i*i°Mi 



1 



PL (y) 



V 



bx 



V 



< ^/ (-t)"^-'6-|*l°Mf 

bx 6-1^^1°' _ 61 1 

aie~l2^l°^ ai — 1 ai ai — 1 

Hence, if $|y|$ is large enough, we have

1 

TT / PL {t) < const. 

PL (y) J_oo 

The same is true for the case $y \ge t_2$, and thus (13) is another sufficient condition for (12).

3.10 Simulation of Gaussian shift by an arbitrary IID sequence 

Suppose $\{q^n, \delta'^n\} = \{q^{\otimes n}, \delta'^{(n)}\}$, where $J_q(\delta') = J$, is given. Suppose also that $L' := \delta'/q$ has a density with
respect to Lebesgue measure, and satisfies (12). Then, by Theorem 12, we can compose an asymptotic tangent
simulation of

$$\{p^n, \delta^n\} := \{N(0,1), \delta N(0,1)\}^{\otimes nJ} \equiv \{N(0,1),\ \sqrt{nJ}\,\delta N(0,1)\}$$

with $\{q^n, \delta'^n\} = \{q^{\otimes n}, \delta'^{(n)}\}$.
Meanwhile, instead, suppose |L'| < const, with probability 1. Then, by a given Let ^ q, and

r'^N(o,i), 

11 1 " 

^L" = ^ (A")* f , y (L' (X') + eY') . 

Then L' (^*) +£F* has density with respect to Lebesgue measure, and satisfies Since Fisher information 
of pi„ equals 

TJ T1 7 

n{J-f{e)) (nm/(e) = 0). 



J-i+e2 l + £2j 
by Theorem 12, one can compose an asymptotic tangent simulation of



{n (0, 1) , ^n{l- f{e))J5^ (0, 1)} ^ {N (0, 1) , <5N (0, l)}^"^^- 



■/(e)) 



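by $\{p_{\tilde{L}^n}(l),\ l\, p_{\tilde{L}^n}(l)\}$. Since $\{\tilde{\Lambda}^n,\ p_{L'^{(n)}}(l),\ l\, p_{L'^{(n)}}(l)\}$ is an asymptotic tangent simulation of $\{p_{\tilde{L}^n}(l),\ l\, p_{\tilde{L}^n}(l)\}$,
by Proposition 1 and Proposition 3, one can compose an asymptotic tangent simulation of $\{N(0,1), \delta N(0,1)\}^{\otimes n(J - f(\varepsilon))}$
with $\{q^n, \delta'^n\} = \{q^{\otimes n}, \delta'^{(n)}\}$.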

3.11 Uniqueness theorem 

Theorem 13 Suppose $g$ satisfies (M0), (A0), (C0), and (N0). Suppose also either (a): $\{p, \delta\}$ is defined
over a finite set, or (b): the probability density $p_L$ of $L$ with respect to Lebesgue measure exists and satisfies
(12). Then, if $E_p(L)^4 < \infty$, $g_p(\delta)$ equals $J = J_p(\delta)$.

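Proof. Let $\{q^n, \delta'^n\} := \{N(0,1), \delta N(0,1)\}^{\otimes n(J+\varepsilon)} \equiv \{N(0,1),\ \sqrt{n(J+\varepsilon)}\,\delta N(0,1)\}$ ($\varepsilon > 0$). Then by
Proposition 4,

$$g_{q^n}(\delta'^n) = n\,(J + \varepsilon).$$

Due to Theorem 11 and Theorem 12, there is $\Lambda^n$ with (3) and (4). Therefore, by (C0) and (M0),

$$0 \le \liminf_{n\to\infty} \frac{1}{n}\left(g_{\Lambda^n(q^n)}(\Lambda^n(\delta'^n)) - g_{p^{\otimes n}}(\delta^{(n)})\right)$$

$$\le \liminf_{n\to\infty} \frac{1}{n}\left(g_{q^n}(\delta'^n) - g_{p^{\otimes n}}(\delta^{(n)})\right)$$

$$= J_p(\delta) + \varepsilon - \limsup_{n\to\infty} \frac{1}{n}\, g_{p^{\otimes n}}(\delta^{(n)}).$$

Similarly, by the argument in Subsection 3.10, we have

$$0 \le \liminf_{n\to\infty} \frac{1}{n}\left(g_{\Lambda^n(p^{\otimes n})}(\Lambda^n(\delta^{(n)})) - g_{N(0,1)}\left(\sqrt{n(J-\varepsilon)}\,\delta N(0,1)\right)\right)$$

$$\le \liminf_{n\to\infty} \frac{1}{n}\, g_{p^{\otimes n}}(\delta^{(n)}) - (J_p(\delta) - \varepsilon).$$

Therefore,

$$J_p(\delta) - \varepsilon \le \liminf_{n\to\infty} \frac{1}{n}\, g_{p^{\otimes n}}(\delta^{(n)}) \le \limsup_{n\to\infty} \frac{1}{n}\, g_{p^{\otimes n}}(\delta^{(n)}) \le J_p(\delta) + \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, we have

$$\lim_{n\to\infty} \frac{1}{n}\, g_{p^{\otimes n}}(\delta^{(n)}) = J_p(\delta),$$

which, combined with (A0), implies

$$g_p(\delta) = J_p(\delta).$$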

We have to check that $g_p(\delta) = J_p(\delta)$ satisfies (M0), (A0), (C0), and (N0). (A0) and (N0) are checked by easy
computation. (M0) is well known. Hence, (C0) is shown in the sequel. We use the following characterization
of Fisher information (see Chap. 9 of [2]):

$$J_p(\delta) = \max_{T} \frac{|E_p\, LT|^2}{E_p\, T^2},$$

where the maximum is achieved by $T = J^{-1} \cdot L$, with $J = J_p(\delta)$. Define $T^n := (nJ)^{-1} \cdot L^{(n)}$, and

$$T_a(x) := \begin{cases} T(x), & (|T(x)| \le a), \\ 0, & (|T(x)| > a), \end{cases}$$

$$T_a^n(x^n) := \begin{cases} T^n(x^n), & (T^n(x^n) \le a), \\ 0, & (T^n(x^n) > a). \end{cases}$$

Observe 



1 lU'n 



o(l), 



where the last identity is due to — P \\i — > and ||(5' 



'^'"^lli ^ Observe also 



Ep»„ (T^y - Ep«„ (T")" = Ep»„ (T")' xt>a (r") 

2 



< 



1 1 



1 



1 



a 

4 



•J n [J a) 

1 1 /n— 1 1„,,4\ 

72^TT2 J+-Ep l M = o 1 . 

^ n ( Ja) V " / 



Similarly, 



^e„»„l(")t," - ^E„»„L(")r' 



^Ep«„i(")T"xt>a(T" 



1 



1 



a 



Therefore, 



Ep»„ (T") 
= Jp((5) + o(l). 



^ ^ 1 |E„8->L(")T"| 
-o(l) = --4- 



0(1) 



= o(l). 
which is (C0). ■



3.12 On asymptotic continuity 

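If for any $\{q^n, \delta'^n\}$ with $\|q^n - p^{\otimes n}\|_1 \to 0$ and $\frac{1}{\sqrt{n}}\|\delta'^n - \delta^{(n)}\|_1 \to 0$,

$$\lim_{n\to\infty} \frac{1}{n}\left(g_{q^n}(\delta'^n) - g_{p^{\otimes n}}(\delta^{(n)})\right) = 0$$

holds, we say $g$ is asymptotically continuous at $\{p^{\otimes n}, \delta^{(n)}\}$. Analogous conditions are used in the study of
entanglement measures etc. In our case, Fisher information satisfies '$\ge$', or weak asymptotic continuity, as
stated in Theorem 13. However, the other side of the inequality, and thus asymptotic continuity, is false. Let
$p := \mathrm{Bin}(1, t)$, and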



2 ' 



(z" = 0") 

g"(a;"):=<( (i _ t)" + (a;" = 1") 
p(^n j^^n-j ^ otherwise 

5(0) = -(5(1) = 1 > 0, 



then we have — p*^"!]! 



i" J- i" 

2 2 



0, ^ 



(5'" - 



= 0, and 



(1 - <)" (1 - tr + f " 



(1-i)" + I"} 



4 Classical Channels: Non-asymptotic theory 
4.1 Axioms 

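Other than being the square of a norm, $G_\Phi(\Delta)$ should satisfy:
(M1) (monotonicity 1) $G_\Phi(\Delta) \ge G_{\Phi\circ\Psi}(\Delta \circ \Psi)$
(M2) (monotonicity 2) $G_\Phi(\Delta) \ge G_{\Psi\circ\Phi}(\Psi \circ \Delta)$
(E) $G_{\Phi\otimes\mathrm{I}}(\Delta \otimes \mathrm{I}) = G_\Phi(\Delta)$
(N) $G_p(\delta) = J_p(\delta)$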
4.2 Simulation of channel families

Suppose we have to fabricate a channel $\Phi_\theta$, which is drawn from a family $\{\Phi_\theta\}$, without knowing the value of
$\theta$ but with a probability distribution $q_\theta$ or a channel $\Psi_\theta$, drawn from a family $\{q_\theta\}$ or $\{\Psi_\theta\}$. More specifically,
we need a channel $\Lambda$ with

$$\Phi_\theta = \Lambda \circ (\mathrm{I} \otimes q_\theta). \qquad (14)$$

Here, note that $\Lambda$ should not vary with the parameter $\theta$. Giving the value of $\theta$ with infinite precision
corresponds to the case where $q_\theta$ is the delta distribution centered at $\theta$.

Differentiating both ends of (14) and letting $\Phi_\theta = \Phi$ and $q_\theta = q$, we obtain

$$\Delta = \Lambda \circ (\mathrm{I} \otimes \delta'), \qquad (15)$$

where $\Delta \in T_\Phi(\mathcal{C})$ and $\delta' \in T_q(\mathcal{P})$.

In the manuscript, we consider tangent simulation, or the operations satisfying (14) and (15), at the point
$\Phi_\theta = \Phi$ only. Note that simulation of $\{\Phi, \Delta\}$ is equivalent to that of the channel family $\{\Phi_{\theta+t} = \Phi + t\Delta\}_t$.
4.3 Relation between J and G 

In this section, we review quickly the properties of norms with (Ml), (M2), (E), and (N). For the proof, see 

Theorem 14 Suppose (M1) and (N) hold. Then,

$$G_\Phi(\Delta) \ge G_\Phi^{\min}(\Delta) := \sup_{p} J_{\Phi(p)}(\Delta(p)) = \sup_{x} J_{\Phi(\cdot|x)}(\Delta(\cdot|x)).$$

Trivially, $G_\Phi^{\min}(\Delta)$ satisfies (M1), (M2), and (N).
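
For a channel with finitely many inputs and outputs, the lower bound of Theorem 14 is easy to evaluate. The sketch below is an illustration with a hypothetical 2-input, 3-output channel: it computes the maximum over inputs of the Fisher information of the output local data, and checks on random input distributions that mixing inputs never exceeds that maximum, which is why the supremum over p reduces to a supremum over single inputs x.

```python
import random


def fisher_information(p, delta):
    return sum(d * d / q for q, d in zip(p, delta))


def g_min(Phi, Delta):
    """max over inputs x of J_{Phi(.|x)}(Delta(.|x)) for a finite-input channel."""
    return max(fisher_information(row_p, row_d) for row_p, row_d in zip(Phi, Delta))


def mix(rows, weights):
    """Output of the channel (or of its tangent) for input distribution `weights`."""
    return [sum(w * row[y] for w, row in zip(weights, rows))
            for y in range(len(rows[0]))]


# Hypothetical channel with 2 inputs and 3 outputs; Phi[x] is Phi(.|x).
Phi = [[0.7, 0.2, 0.1], [0.3, 0.3, 0.4]]
Delta = [[0.1, -0.1, 0.0], [-0.2, 0.05, 0.15]]
bound = g_min(Phi, Delta)

rng = random.Random(1)
worst = 0.0
for _ in range(1000):
    t = rng.random()
    weights = [t, 1.0 - t]
    worst = max(worst, fisher_information(mix(Phi, weights), mix(Delta, weights)))
print(bound, worst)
```

The check relies on the joint convexity of the map (p, delta) -> sum of delta^2 / p, so the random search is a demonstration rather than a proof.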
Theorem 15 Suppose (M2), (E) and (N) hold. Then

$$G_\Phi(\Delta) \le G_\Phi^{\max}(\Delta) := \inf_{\Lambda, q, \delta} \{ J_q(\delta)\ ;\ \Lambda \circ (\mathrm{I} \otimes q) = \Phi,\ \Lambda \circ (\mathrm{I} \otimes \delta) = \Delta \}.$$

Also, $G_\Phi^{\max}(\Delta)$ satisfies (M1), (M2), (E), and (N).

Obviously, $G_\Phi^{\min}(\Delta)$ and $G_\Phi^{\max}(\Delta)$ are not induced from any inner product, i.e., they cannot be written as
$S_\Phi(\Delta, \Delta)$, where $S_\Phi$ is a positive real bilinear form. Indeed, we can show the following:

Theorem 16 Suppose (M1), (N), and (E) hold. Then, $G_\Phi(\Delta)$ cannot be written as $S_\Phi(\Delta, \Delta)$, where $S_\Phi$ is a
positive bilinear form.
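5 Classical Channels: Asymptotic Theory

5.1 Asymptotic Theory: additional axioms

(A) (asymptotic weak additivity) $\lim_{n\to\infty} \frac{1}{n}\, G_{\Phi^{\otimes n}}(\Delta^{(n)}) = G_\Phi(\Delta)$

(C) (weak asymptotic continuity) If $\|\Phi^n - \Phi^{\otimes n}\|_{\mathrm{cb}} \to 0$ and $\frac{1}{\sqrt{n}}\|\Delta^n - \Delta^{(n)}\|_{\mathrm{cb}} \to 0$, then

$$\liminf_{n\to\infty} \frac{1}{n}\left(G_{\Phi^n}(\Delta^n) - G_{\Phi^{\otimes n}}(\Delta^{(n)})\right) \ge 0.$$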

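5.2 Asymptotic tangent simulation: definition

We consider an asymptotic and approximate version of (14) and (15):

$$\lim_{n\to\infty} \|\Phi^{\otimes n}(p) - \Lambda^n(p \otimes q^n)\|_{\mathrm{cb}} = 0, \quad \forall p, \qquad (16)$$

and

$$\lim_{n\to\infty} \frac{1}{\sqrt{n}} \|\Delta^{(n)}(p) - \Lambda^n(p \otimes \delta'^n)\|_{\mathrm{cb}} = 0, \quad \forall p, \qquad (17)$$

with "program" $\{q^n, \delta'^n\}$. Here, the larger one of $\|\Phi^{\otimes n}(p) - \Lambda^n(p \otimes q^n)\|_{\mathrm{cb}}$ and $\frac{1}{\sqrt{n}}\|\Delta^{(n)}(p) - \Lambda^n(p \otimes \delta'^n)\|_{\mathrm{cb}}$
is called the error of the asymptotic tangent simulation.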

5.3 Finite inputs 

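In this subsection, $\Omega_{\mathrm{in}} = \{1, \dots, k\}$.

Theorem 17 Suppose $\{\Phi(\cdot|x), \Delta(\cdot|x)\}$ satisfies all the conditions imposed on $\{p, \delta\}$ in Theorem 13. Let
us define $\{q_\varepsilon^n, \delta_\varepsilon^n\} := \{N(0,1), \delta N(0,1)\}^{\otimes n(1+k\varepsilon)(J+c)}$, where $J = G_\Phi^{\min}(\Delta) = \max_{1\le x\le k} J_{\Phi(\cdot|x)}(\Delta(\cdot|x))$ and
$\varepsilon > 0$, $c > 0$ are arbitrary. Then, there is $\Lambda^n$ such that

$$\|\Phi^{\otimes n}(p) - \Lambda^n(p \otimes q_\varepsilon^n)\|_{\mathrm{cb}} \le \frac{C}{(\varepsilon n)^{1/4}},$$

$$\frac{1}{\sqrt{n}}\|\Delta^{(n)}(p) - \Lambda^n(p \otimes \delta_\varepsilon^n)\|_{\mathrm{cb}} \le \frac{C}{(\varepsilon n)^{1/4}}, \qquad (18)$$

where $C$ is a function of $\{\Phi(y|x), \Delta(y|x)\ ;\ x \in \Omega_{\mathrm{in}},\ y \in \Omega_{\mathrm{out}}\}$. In particular, if $|\Omega_{\mathrm{out}}| < \infty$, this function is
continuous and bounded.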

Proof. Given the input sequence $x^n = x_1 \cdots x_n$, denote the number of $x$ in $x^n$ by $N_x$. Suppose $N_x \ge \varepsilon n$.
Then, we use $\{N(0,1), \delta N(0,1)\}^{\otimes N_x(J+c)}$ for the simulation of $\{\Phi(\cdot|x), \Delta(\cdot|x)\}^{\otimes N_x}$. On the other hand, if
$N_x < \varepsilon n$, we first fabricate $\{\Phi(\cdot|x), \Delta(\cdot|x)\}^{\otimes \varepsilon n}$ using $\{N(0,1), \delta N(0,1)\}^{\otimes \varepsilon n(J+c)}$ and take the marginal. We

do this for all $x = 1, \dots, k$. Since

A; 

(g){N (0, 1) , (5N (0, = {N (0, 1) , (5N (0, i)}«>"a+ke){J+c) ^ 

x=l 

by Theorem 11 and Theorem 12, we have (18), and the proof is complete. ■

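Theorem 18 Suppose $\{\Phi(\cdot|x), \Delta(\cdot|x)\}$ satisfies all the conditions imposed on $\{p, \delta\}$ in Theorem 13 for all
$x \in \Omega_{\mathrm{in}}$. Then, if a metric $G$ satisfies (M1), (M2), (E), (A), (C), and (N0),

$$G_\Phi(\Delta) = G_\Phi^{\min}(\Delta).$$

Proof. Due to Theorem 14, we only have to show $G_\Phi(\Delta) \le G_\Phi^{\min}(\Delta)$. Consider the simulation of $\{\Phi, \Delta\}$
by $\{q_\varepsilon^n, \delta_\varepsilon^n\}$ as in Theorem 17. Due to Theorem 13,

$$G_{q_\varepsilon^n}(\delta_\varepsilon^n) = J_{q_\varepsilon^n}(\delta_\varepsilon^n).$$

Therefore, due to (18), we have

$$0 \underset{(C)}{\le} \liminf_{n\to\infty} \frac{1}{n}\left(G_{\Lambda^n\circ(\mathrm{I}\otimes q_\varepsilon^n)}\left(\Lambda^n \circ (\mathrm{I} \otimes \delta_\varepsilon^n)\right) - G_{\Phi^{\otimes n}}(\Delta^{(n)})\right)$$

$$\underset{(M2)}{\le} \liminf_{n\to\infty} \frac{1}{n}\left(G_{\mathrm{I}\otimes q_\varepsilon^n}(\mathrm{I} \otimes \delta_\varepsilon^n) - G_{\Phi^{\otimes n}}(\Delta^{(n)})\right)$$

$$\underset{(E)}{=} \liminf_{n\to\infty} \frac{1}{n}\left(G_{q_\varepsilon^n}(\delta_\varepsilon^n) - G_{\Phi^{\otimes n}}(\Delta^{(n)})\right)$$

$$= \liminf_{n\to\infty} \frac{1}{n}\left(J_{q_\varepsilon^n}(\delta_\varepsilon^n) - G_{\Phi^{\otimes n}}(\Delta^{(n)})\right)$$

$$= (1 + \varepsilon k)\left(G_\Phi^{\min}(\Delta) + c\right) - \lim_{n\to\infty} \frac{1}{n}\, G_{\Phi^{\otimes n}}(\Delta^{(n)})$$

$$\underset{(A)}{=} (1 + \varepsilon k)\left(G_\Phi^{\min}(\Delta) + c\right) - G_\Phi(\Delta).$$

Since $\varepsilon > 0$ and $c > 0$ are arbitrary, we have

$$G_\Phi(\Delta) \le G_\Phi^{\min}(\Delta).$$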

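Finally, we show that $G_\Phi^{\min}(\Delta)$ satisfies (C). Let $x_* \in \Omega_{\mathrm{in}}$ with $J_{\Phi(\cdot|x_*)}(\Delta(\cdot|x_*)) = G_\Phi^{\min}(\Delta)$, and $x_*^n = x_* x_* \cdots x_*$.
Then, since $G_{\Psi^n}^{\min}(\Delta'^n) \ge J_{\Psi^n(\cdot|x_*^n)}(\Delta'^n(\cdot|x_*^n))$, we have

$$\liminf_{n\to\infty} \frac{1}{n}\left(G_{\Psi^n}^{\min}(\Delta'^n) - G_{\Phi^{\otimes n}}^{\min}(\Delta^{(n)})\right) \ge \liminf_{n\to\infty} \frac{1}{n}\left\{J_{\Psi^n(\cdot|x_*^n)}(\Delta'^n(\cdot|x_*^n)) - J_{\Phi(\cdot|x_*)^{\otimes n}}\left(\Delta(\cdot|x_*)^{(n)}\right)\right\}.$$

The LHS of this is non-negative due to Theorem 13, since

$$\|\Psi^n(\cdot|x_*^n) - \Phi(\cdot|x_*)^{\otimes n}\|_1 \le \|\Psi^n - \Phi^{\otimes n}\|_{\mathrm{cb}} = o(1),$$

$$\frac{1}{\sqrt{n}}\|\Delta'^n(\cdot|x_*^n) - \Delta(\cdot|x_*)^{(n)}\|_1 \le \frac{1}{\sqrt{n}}\|\Delta'^n - \Delta^{(n)}\|_{\mathrm{cb}} = o(1).$$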
5.4 Continuous inputs

In this subsection, $\Omega_{\mathrm{in}}$ is a compact set in $\mathbb{R}^d$. Also, $\|\cdot\|$ is the usual 2-norm.

Theorem 19 Suppose |fiout| < oo and 

max{|!<i>(.|x)-<i>(.|x')||i,||A(.|a;)-A(.|:r')|l}</(|lx-x'||) 

holds for some liuit^o f (t) = 0. Let us define {q^,S^} {N (0, 1) , (5N (0, where J = 

G^™ (A) — niaxi<a;<fe J^(^.\x) (A (-la;)) and e > 0, c > are arbitrary. Then, there is a family {$f , Afj^^p 
and I A", q"^, (5^" I such that 

l$f"(p)-Ar(p®<,)ll^,<^ (19) 

(20) 
(21) 

Proof. Let At C 51in = R'^ be the totality of lattice points such that uiiiix^yeAt ^ uW = t- Define 

$,(.|x) :=<i>(-|y), At(.|x) :=A(.|y), 

where y is the closest point in At to x. By assumption, {$t,At} satisfies (|2ip . By TheoremflTl we can 
compose Aj" with (HH) and ■ 



1 



Ar 



cb 


< 


Ct 
\/en 




< 


Ct 


cb 




(en)^ 


cb 




lim 1 



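(C2) If $\lim_{t\to 0}\|\Phi_t - \Phi\|_{\mathrm{cb}} = \lim_{t\to 0}\|\Delta_t - \Delta\|_{\mathrm{cb}} = 0$, then $\lim_{t\to 0} G_{\Phi_t}(\Delta_t) = G_\Phi(\Delta)$.

Theorem 20 Suppose $\{\Phi, \Delta\}$ satisfies all the assumptions of Theorem 19. Then, if $G$ satisfies (M1), (M2),
(E), (A), (C), (N0), and (C2),

$$G_\Phi(\Delta) = G_\Phi^{\min}(\Delta).$$

Proof. Again, we only have to show $G_\Phi(\Delta) \le G_\Phi^{\min}(\Delta)$. By Theorem 19, $G_{\Phi_t}(\Delta_t) = G_{\Phi_t}^{\min}(\Delta_t)$. Therefore,
due to (C2),

$$G_\Phi(\Delta) = \lim_{t\to 0} G_{\Phi_t}(\Delta_t) = \lim_{t\to 0} G_{\Phi_t}^{\min}(\Delta_t).$$

On the other hand, by construction of $\{\Phi_t, \Delta_t\}$,

$$G_{\Phi_t}^{\min}(\Delta_t) \le \sup_{x} J_{\Phi(\cdot|x)}(\Delta(\cdot|x)) = G_\Phi^{\min}(\Delta).$$

Hence, we have the assertion. ■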
5.5 Quantum states as a classical channel

A quantum state can be viewed as a channel which takes a measurement as an input and outputs the
measurement result. Hence, if we restrict the measurements to separable measurements, the asymptotic theory
discussed in this paper is applicable to quantum states as well, proving the uniqueness of the metric. On the
other hand, there is a variety of monotone metrics, and lower asymptotic continuity is proven for some of
them, e.g., the SLD and RLD metrics. This apparent contradiction can be circumvented by recalling that the
theory of this paper is not applicable to the case of collective measurements.